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ABSTRACT 

All 7-ray telescopes suffer from source confusion due to their inability to focus incident high-energy 
radiation, and the resulting background contamination can obscure the periodic emission from faint 
pulsars. In the context of the Fermi Large Area Telescope, we outline enhanced statistical tests for 
pulsation in which each photon is weighted by its probability to have originated from the candidate 
pulsar. The probabilities are calculated using the instrument response function and a full spectral 
model, enabling powerful background rejection. With Monte Carlo methods, we demonstrate that the 
new tests increase the sensitivity to pulsars by more than 50% under a wide range of conditions. This 
improvement may appreciably increase the completeness of the sample of radio-loud 7-ray pulsars. 
Finally, we derive the asymptotic null distribution for the -ff-test, expanding its domain of validity to 
arbitrarily complex light curves. 

Subject headings: methods: statistical — methods: data analysis — pulsars: general — gamma rays: 
general 
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1. INTRODUCTION 

Though the detection and characterization of periodic 
emission from neutron stars has h istorically been the 
province of radio astronomers (e.g. iHewish et al.l 119681 : 
iBacker et al.lll982t iManchester et al.ll2001h . increasingly 
sensitive instruments have enabled the study of pulsars at 
high energy. Neutron stars accreting near their Edding- 
ton limit were detected in X-rays as accretion-powered 
pulsars dWhite et all [l983h in the 1970s by UHURU 
(jGiacconi et al.lll97if ). while the sensitive ROSAT and 
ASCA missions detected faint magnetospheric emission 
from a population of rot ation-powered X-ray pulsars 
(|Becker fc TruemperlH997l ). 

At even higher energies, the handful of well-k nown 
7-ray pul sars — e.g., Vela ([Thompson et ail I1975D and 
Geminga (jBertsch et al.l I1992D — have been joined by a 
host of new pulsars detected by the Fermi Large Area 
Telescope (Fermi-LAT). While the superb sensitivity of 
Fermi-LAT has facilitated the firs t discov eries of pulsars 
in 7 rays alone (jAbdo et al.ll2008l I2009H) . a foundation 
of pulsar science with Fermi-LAT is the extensive sup- 
port from the r adio community. E .g., the Pulsar Tim- 
ing Consortium (jSmith et al.ll2008l ) generates timing so- 
lutions for over 200 pulsars with high spindown luminos- 
ity, E > 10 34 erg s . These timing solutions enable the 
long integrations necessary to detect periodicity in sparse 
Fermi-LAT photons, and so far more than 30 timed pul- 
sars have be en detected in 7 rays (|Abdo et al.1 [2009d, 
[20105 l2009al e.g.). 

Although the precise emission geometry of pulsars is 
still unknown, it is not unreasonable to beli eve that ~ 
50% of radio-loud pulsars are also 7-ray loud (jRavi et al.1 
2010), a fraction that may be even larger for millisecond 
pulsars. If this is indeed the case, then many of the 
luminous radio-timed pulsars are visible in 7 rays but 
below the current sensitivity of the LAT. While Fermi- 
LAT continuously observes the GeV sky, the flux above 
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which pulsars are detectable only decreases as t^ 1 / 2 . Ten 
years of observation will only decrease the current flux 
threshold by a factor of ^2. 

Improved analysis techniques can beat this rate, and 
in this vein we outline better st atistics for testing f or pe- 
riodicity. To date, such tests (jAbdo et al.l 12010a ) have 
us ed only the a r rival t ime/phase of a photon. As shown 
bv lBickel etUI (|2008L hereafter BKR08), the additional 
data available — the photon's reconstructed energy and 
position — allow the calculation of a probability that the 
given photon originated from the candidate pulsar, and 
that incorporating this probability into the test statis- 
tics helps reject background and increase the sensitiv- 
ity to pulsations. Although we focus on the applica- 
tion of the technique to LAT data, we note that the 
scheme is applicable to any photon-counting instrument 
in which sources are not perfectly separated from their 
background, e.g. searches for X-ray pulsation in obser- 
vations of a pulsar embedded in a pulsar wind nebula. 

The paper is organized as follows. We begin in Section 
[5] by giving an overview of a family of statistics — based 
on trigonometric moments and formulated as a score test 
for pulsation — that includes the weighted pulsation tests 
discussed here. In Sections 12. R we review the Z^and 
if-test statistics and define modified versions incorpo- 
rating weights. We outline the calculation of probability 
weights appropriate for the LAT in Section [3] and in Sec- 
tion [?] we demonstrate the superior performance of the 
weighted versions of the Z^and -ff-test statistics. In the 
Appendix, we derive the asymptotic calibration for the 
-ff-test, a new result expanding the scope of the test. 

2. STATISTICAL TESTS FOR PERIODICITY 

A timing solution (ephemeris) defines a map from pho- 
ton arrival time t to phase <f>, e.g. neutron star rotational 
phase. The flux from a pulsar can be written as 

oo 

f(<f>, E) cx 1 + n a k cos(27rfc0) + j3 k sin(27r/c0). (1) 
fe=i 
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For simplicity, we have assumed the pulsar spectrum is 
independent of phase. 

The null hypothesis — no pulsation — is given by r) = 0. 
By considering the likelihood for photon arrival times, 
BKR08 derived a test statistical for r) > 0, 



(2) 



fe=i 



where T is the total integration time and 



&k = E Wi cos(27rfcc/>i); f3 k = ^ Wi sin(27rfcc/>i), (3) 

i=l i=l 

where the sum is over the list of n photons and Wi is 
some weight. (See Eq. 17 of BKR08, from which this 
form is adapted.) For Wi — 1/n, these are estimators of 
the trigonometric moments of the distribution and Monte 
Carlo estimators for the coefficients of the Fourier trans- 
form. If Wi is the probability that a photon comes from 
the pulsar, the weights are optimal in the sense that Q 
is a score test, which is locally most powerful. Moreover, 
the statistic is manifestly invariant under phase shifts 
<fi — > <j> + 5<j>. (See also the result of iBeranl JT969), who 
considered a similar class of statistics.) From Central 
Limit Theorem arguments, BKR08 show that for a finite 
collection of m harmonics, Q has an asymptotic distri- 
bution of xl m - 

2.1. The Z 2 m Test 

A simple realization of such a statistic, with u>; = 1 
and afc = /3k = 1 for k < m and a k = fik — for k > m, 
known as the Z 2 statistic, 



k=l 



■01, 



(4) 



has been a workhorse for searches for 7-ray pulsars. 
(Note the change in normalization: 1/T f=s 1/n.) A Z\ 
test was used in a search for pulsations in COS-B data us- 
ing ti ming solutions for 145 radio pulsars dBuccheri et al.l 
119831 ). A similar search of EGRET data (|Nel et alJl!99ffl 
used tests with 1, 2, and 10 harmonics, the -ff-test 
(see below), and the "Zf+4" test which is defined as 
above but with summation restricted to the 2nd and 4th 
harmonics. Z^ forms an integral part of the if-test, 
and continues to se e use in analysis of Fermi-LAT data 
(|Abdo et al.H20T0cT) . 

From the discussion above, we expect Z^ to be dis- 
tributed as X2m m the null case, but this result holds if 
and only if the a k and /3& coefficients are statistically in- 
dependent. However, in some cases they may be highly 
dependent. For a single observation, a k (/?&) can be in- 
verted to find the original rv, </>, and thus all coefficients 
are algebraically determined. For large samples from a 
uniform distribution, however, a given coefficient conveys 
little information about the underlying {&} and the co- 
efficients are approximately independent. Although the 
requirement on the detector response assures uniformity, 

2 The form of the statistic presented here requires that the pulsar 
period be short relative to the timescale on which the detector 
response changes. This is so for the Fermi-LAT. 



if the null distribution is peaked, the coefficients remain 
correlated for arbitrarily large sample sizes. 

To determine the minimum sample size required to 
reach the asymptotic \ 2 distribution, we performed a 
Monte Carlo study of the convergence as a function of 
both sample size and maximum harmonic (m) , shown in 
Figure [T] Evidently, a sample size of at least 50 phases 
is required for robust significance estimation at the 3tr 
level, and convergence appears to improve for higher val- 
ues of m. 

The defintion of the Z^can be viewed as a set of hard 
cuts for which Wi = 1 for some set of photons and Wi — 
for all other photons. By allowing arbitrary values for 
Wi we implement soft cuts in which we ideally assign 
higher (lower) weights to photon associated with the pul- 
sar (background). We thus define the weighted Z^&s 



1 



E- 2 

i=l / 



E< 

k=i 



pi 



(5) 



As a realization of the test statistic of Eq. [21 its cali- 
bration remains xtm- Again, note the expression of the 
normalization in terms of the data; the relation can be 
seen in terms of a random walk. 

2.2. The H- Test 

It is clear from Eq. [2] that for an optimal test, the em- 
pirical coefficients should be weighted by the true Fourier 
coefficients, which are a priori unknown. The estima- 
tion of unit coefficients up to a maximum harmonic m 
in the Z^is crude, and choosing m too small will result 
in a loss of power against sharply-peaked light curves, 
while m too large will lose power again st broad, sinu- 
soidal light curves. To improve on this, Ide Jager et al.l 
(1989) proposed the iJ-test, which estimates m from the 
data. They defined 



c x (i — 1)] , 1 < i < m 



(6) 



and specifically recommended m = 20 and c = 4 as 
an omnibus test. They provided a Monte Carlo cal- 
ibra tion of the tail probabilit y and, in a recent pa- 
per (|de Jager fc Bu sching 2010f l provide an estimate 1 — 
FH 2Q {h) ~ exp(— 0.4 x h) good to h « 70. We derive 
the analytic, asymptotic calibration for all values of m, 
c, and h in the Appendix and use this calibration any- 
where a conversion from h to chance of Type I error (i.e., 
"a") is needed. 

This calibration depends only on the asymptotic x\m 
calibration of Z^, and so it also holds for Z^ w and thus 
for a weighted H-test statistic 



H mw = max [Zf w — c x (i — 1)] , 1 < i < 



(7) 



We adopt the original values m = 20 and c = 4 below. 

3. CALCULATING PHOTON PROBABILITIES 

The optimal choice of weight is the probability that a 
photon originates from the pulsar, and we outline the 
calculation of this quantity for the Fermi-LAT . We have 
available three pieces of information, viz. the photon's 
reconstructed energy, position, and arrival time. For pul- 
sars, the arrival time is necessary for computation of the 
phase but can otherwise be ignored except in rare cases, 
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1,000,000 Monte Carlo Trials 




TS = Z^/(2m) TS = Z^/(2m) TS=Z^/(2m) 



Figure 1. The observed distribution for the Z„ statistic for varying sample size and maximum harmonic (m). The asymptotic calibration 
is shown as a solid line, while the solid band gives the empirical distribution function of the Monte Carlo realizations. The width indicates 
the statistical uncertainty estimated as y N(>= TS). 



e.g. a candidate near a bright, highly variable blazar. We 
thus consider time-averaged quantities only. (Recall we 
have assumed phase-independent pulsar sp ectra. Spec- 
tra ge nerally are dependent on phase (e.g. lAbdo et"aLl 
l2010dD . but such using a phase-averaged spectrum in- 
curs little error.) 

Probabilities constructed using only the pho- 
ton position have been emplo yed profitab l y for 
COS-B and EGRE T analyses tiBrown et alj U99l 
Ramanamurthv et al.l 119961 iMcLaughlin fc Corded 
2003ft . This approach has the advantage of requiring 
no knowledge of source spectra and works well for a 
detector with an excellent and/or energy- independent 
psf. The Fermi-h AT , on the other hand, has a com- 
paratively poor psf that varies by more than two 
orders of magnitude between 100 MeV and 100 GeV 
(jAtwood et al.ll2~009| ). A typical source will have many 
background photons within the psf radius at low energy 
but very few at high energies, and a probability based 
solely on position cannot discriminate against the many 
low energy background photons. 

We address the strong dependence of S /N on energy by 
including the estimated photon energy in the probability 
calculation. Briefly, a point source is characterized by 
its photon flux density (ph cm~ 2 s _1 MeV -1 ) which we 

model as ^(E, A), with E the energy and A some set of 
parameters (e.g., the normalization and photon index for 
a power law; these parameters will typically be estimated 
via maximum likelihood analysis as discussed in the fol- 
lowing section). We assume the source is stationary. If 

the exposur^3 to the position of the source, f2n, is given 
by e(E,(lo) (cm 2 s), then the expected differential rate 
(ph MeV -1 sr _1 ) in the detector from the jth source is 

r 3 (E,n) = T(E,\)e(E 1 n ) f psl (n-n (h E), (8) 

where / ps f (J7; S!o> E) denotes the psf of the instrument 
for the incident energy and position. While the psf only 
depends on incidence angle, the highly structured diffuse 

3 The exposure calculation for the Fermi-LAT involves summing 
the detector response over the pointing history of the spacecraft. 



background motivates preservation of the full position. 

For a photon observed with energy near £ at a posi- 
tion near fl, the probability that it originated at the jth 
source is simply 

Wj ( E ,m^ ; AEAXj) , (9) 

£n(E,n,\i) 

where N s is the number of contributing sources. The 
weight can approach 1 for bright pulsars and at high 
energies, where the LAT psf becomes narrow. 

4. PERFORMANCE IN FERMI-LAT PULSATION 
SEARCHES 

The primary goal in adopting the weighted versions of 
pulsation test statistics is, of course, to find more gen- 
uine pulsars. In statistical language, we want to min- 
imize Type I Error (false positives) and Type II error 
(false negatives). We address these two in turn by com- 
paring weighted and unweighted versions of the Z 2 n and 
H-test statistics. Although BKR08 provide an analytic 
expression for the increased detection significance offered 
by photon weights (see Eq. 23 of that work), it is not 
suitable for determining the global performance improve- 
ment since the optimal data selection for the unweighted 
tests is unknown a priori. We thus assess the perfor- 
mance of the tests on an ensemble of simulated pulsars. 

4.1. Simulation Details 

The Fermi-LAT Science Too0 gtobssim uses a detailed 
characterization of the instrument response function to 
simulate events from modeled point and diffuse sources. 
We simulate a point source — the candidate pulsar — with 
a realistic pulsar spectrum, 

^ cx (E/GcV)- r cM-E/E c ) (10) 

with r = 1.5 and E c = 3 GeV. This spectrum is typ- 
ical of many pulsars and is approximately that of the 

4 http: / / fermi.gsfc.nasa.gov / ssc / data/analysis / scitools / overview.html 
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middle-aged Vela pulsar (|Abdo et al.ll2009cl ). We place 
the source at (R.A., Decl.) = (128.8463, -45.1735), 
the position of Vela. For the background, we simu- 
late photons from the two diffuse background model s 
used in the 1FGL catalog analysis (jAbdo et al.l l2010ah . 
glLiem_v02 — a model of the Galactic diffuse background 
due to cosmic rays — and isotropic-iem_v02 — an isotropic 
background including contributions from unresolved ex- 
tragalactic point sources and instrumental backgrounds. 
In all, we simulate one year of integration using the space- 
craft pointing history from 2009. 

The normalization of Eq. [10] is chosen to yield a bright 
source with an integral photon flux from 100 MeV to 100 
GeV of J- sim = 1(V 5 ph cm -2 s _1 . We are interested in 
detection of dim sources, so from this set of photons we 
select subsets emulating sources with appropriately lower 
fluxes, e.g. Ttar = 10~ 8 ph cm~ 2 s _1 , by (a) drawing the 
target number, N tar , of photons from a Poisson distribu- 
tion with mean N to t X Ftar/Fsim an d (b) selecting a sub- 
set of N t ar of the original photon set at random (without 
replacement). In this way, we can generate ensembles of 
statistically independent point sources over a range of 
fluxes from a single Monte Carlo data set. While we use 
a single realization of the diffuse photons, we effectively 
generate a new iteration by randomizing these photons 
in phase for each ensemble member as we discuss below. 

To determine the weights (Eq. [9]), we employ gtsr- 
cprob. This Science Tool combines the source models 
(those used to generate the Monte Carlo data) with the 
instrument response function to determine the observed 
source rates and hence the weights. These weights are 
valid for the simulated point source flux J-5, and must 
be scaled for the dim ensemble members: w^ ar — 1 = 

( w sim — 1) x Fsim/J-'tar- 

The Monte Carlo events as generated by gtobssim have 
no pulsation. During simulation, each photon is "tagged" 
with an identifier for its originating source. We assign 
phases from a uniform distribution to the diffuse back- 
ground, and for the point source we draw phases from 
an assumed light curve, typically a normalized sum of 
wrapped Gaussians. 

4.2. Performance: Type 1 Error 

Type I error stems from two sources. First, there is 
the chance of a fluctuation in the test statistic (TS) suf- 
ficiently large to pass the established threshold for rejec- 
tion of the null hypothesis, i.e., claiming detection of a 
pulsar. As long as we understand the null distribution of 
the TS, this particular source of error is easy to control: 
we simply determine in advance our tolerance to false 
positives and set the TS threshold accordingly. We must 
be cautious about applying the asymptotic calibration 
of the null distribution to small sample sizes. For these 
cases, it is important to verify the chance probability 
with a Monte Carlo simulation. 

A second, more insidious source of error arises from 
the strong influence of the data selection scheme on the 
pulsed S/N, this dependence stemming from the energy- 
dependent psf, source confusion, the strong Galactic dif- 
fuse, and the exponential suppression of pulsar emission 
above a few GeV. To find the best cuts, one is tempted 
to use the TS itself as a metric, a procedure which inval- 
idates its calibration. Failure to account for this change 
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Figure 2. A comparison of the dependence on event selection of 
the H20W and H20 statistics. Two members of an ensemble (gen- 
erated as described in the main text) with flux 10 — 8 ph cm -2 s — 1 
and a single-peaked light curve of Gaussian shape (cr = 0.03) are 
shown, one in each row. The two test statistics were calculated 
over a grid of data selection criteria: the y-axis gives the extrac- 
tion radius (maximum angular distance from the pulsar) and the 
x-axis indicates the threshold energy below which photons are ex- 
cluded. The lefthand (righthand) columns shows the results for the 
weighted (un-weighted) test statistic after conversion to a units and 
normalization to the maximum observed value. 

increases the probability of false positives. Stringent cuts 
may also make the asymptotic calibration poorer. 

Probability-weighted statistics eliminate these prob- 
lems. Since the weights naturally go to zero for photons 
with neglible signal, one could in principle include all 
LAT data in the TS. (In practice, little signal is contained 
in photons more than 2° from the source.) Weighting 
provides an amorphous, optimal selection reflecting, e.g., 
the proximity of the Galactic plane or the estimated pul- 
sar cutoff energ 30. And this single selection incurs no 
probability of Type I error beyond that due to statistical 
fluctuations. Finally, the weighted statistic is less suscep- 
tible to small-sample effects since all relevant information 
is included, though for particularly weak signals Monte 
Carlo validation remains important. 

To make these claims concrete, we compare the 
weighted if- test (#20™) and the standard if -test (H20) 
computed over a grid of cuts on photon position and en- 
ergy (Figure [5]). The unweighted statistics show strong 
TS peaks for certain energy thresholds and extraction 
radii, and these peaks vary from realization to realiza- 
tion, precluding an a priori calculation of an optimal 
aperture. The weighted statistics, on the other hand, 
are largely insensitive to the data selection and perform 
best for a simple prescription: use as many photons as 
is practical. We are thus free to use the same loose cuts 
for all sources, maintaining good performance (peak TS) 
without the need to tune. 

4.3. Performance: Type II Errors / Sensitivity 

We must also consider Type II error — false negatives. 
In astrophysical terms, this error rate is essentially the 

5 In this regard, the weights deliver a data set similar to that 
proposed by Mavcr-HaBclwandcr & Ozcl (1983), who developed an 
algorithm for determining an optimal aperture with an arbitrary 
"edge" determined by the local signal-to-noise ratio 
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Figure 3. The dependence of significance in a units as a function 
of flux. The reported value at a given flux is that attained by 
at least 68% of the ensemble of 50 MC realizations. The trend 
is approximately linear, as explained in the main text, making 
it simple to invert the relation. The light curve here is a single 
Gaussian peak with a = 0.03. 

sensitivity of the method, i.e., the flux threshold above 
which a typical source will be significantly detected in 
a given observation. Indeed, we adopt this approach to 
define the Type II error. Given a particular light curve, 
we generate ensembles of sources at a series of increasing 
fluxes until we identify the flux at which a given frac- 
tion of the ensemble has a detection significance above 
threshold. 

More specifically, to compute the detection signifi- 
cance, we calculate the tail probability of the asymp- 
totic distribution for the TS in question convert it to 
(two-tailed) a units, i.e., a chance probability of Xa is 

I — (27r) -1 / 2 J_ x dx exp(— x 2 /2). We determine the flux 

threshold as that for which 68% of the ensemble deliver 
a a value above a pre-determined threshold (see below). 
This is the detection flux threshold. To estimate the 68% 
level robustly, we fit the ensemble values with a normal 
distribution and report the appropriate quantile. 

To determine the flux threshold, we need to invert the 
calculated quantity (tail probability in a units) to the 
desired quantity (flux threshold). Surprisingly, the rela- 
tionship between the two is linear. Significance canon- 
ically scales as y/n, or in this case, since we integrate 
for exactly one year, \/~T , However, here we increase the 
source flux without increasing the background flux, so we 
also increase the S/N, and in the background-dominated 
regime, significance is proportional to the S/N. Thus, 
a oc J 7 , demonstrated in Figure [3] 

We choose 4cr as the threshold which causes us to dis- 
card the null hypothesis, i.e., to claim detection of peri- 
odic emission. This threshold corresponds to a very small 
probability of Type I error, about I in 15800. Selecting 
an even higher threshold — and consequently applying the 
asymptotic calibration well into the distribution's tails — 
is not appropriate for these small sample sizes. 

To give a modest survey of the statistics outlined 
above, we present weighted and un-weighted versions of 
H20, Zf 2 , and Z\. Recall that 7?20 is an omnibus test 
that depends little on light curve morphology, whereas 
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Figure 4. The 68% flux detection threshold based on a 4<j detec- 
tion criterion for a single-peaked Gaussian light curve as a function 
of duty cycle. Light curves with narrow (broad) peaks lie to the left 
(right). H20, Z\2' an d Z% are *' le unweighted statistics with a sin- 
gle data extraction, and WH20, WZ^, and WZS are the weighted 
versions of these statistics. GH is the gridded, unweighted H20 
statistic. 

Zf 2 {Z\) should perform well for light curves with sharp 
(broad) features. For the unweighted versions of these 
tests, we select photons with E > 200 MeV and an angu- 
lar separation from the candidate < 0.8°. However, since 
no single extraction criterion will yield optimal values for 
the unweighted statistics, we also include a "grid search" 
for #20 ■ That is, for each source, we extract photons 
with E > E t h and 6 < 0th, i-e., reconstructed energies 
above E t h and reconstructed positions separated from 
the true position by less than 6 t h- We do this for a grid 
of E th and 9 th with E th /MeV € (100, 178, 316, 562, 1000) 
and 0th/deg. £ (0.5,0.625,0.75,0.875,1.0). We take the 
maximum resulting test statistic, convert it to chance 
probability, multiply by 25 (the number of "trials"), and 
convert this quantity to a units. Finally, for the weighted 
statistics, we select photons with E > 100 MeV and 
9 < 2°. 

The primary result, shown in Figure El shows the flux 
threshold for each method as a function of "duty cycle" , 
the fraction of the full phase for which there is appre- 
ciable pulsed emission. In this case, the light curves are 
single Gaussian peaks with a variety of values for their a 
(standard deviation) parameter; the templates are shown 
in Figure [5] For a fixed flux, increasing the duty cycle 
decreases the peak flux, or S/N, and the primary depen- 
dence is then an inverse relation between flux threshold 
and duty cycle. However, there is additional dependence 
from the nature of each test. It is clear, e.g., that the H20 
and Zf 2 perform significantly better than Z\ for low duty 
cycle sources, while Z\ maintains a slight edge for broad 
light curves. The iJ-test does a good job for all duty 
cycles. More importantly, it is clear that the weighted 
statistics enjoy a lower threshold for detection — by a fac- 
tor of 1.5 (grid search) to 2.0 (single cut) — than their 
unweighted counterparts. 

4.3.1. Two-peaked light curves 

Many pulsar light curves disp lay two peaks, often sep- 
arated by about 0.4-0.5 cycles (|Abdo et al.ll2010d) . We 
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Figure 5. The templates used for the single- and double-peaked 
light curves in the determination of the detection flux thresholds. 
The functional form is a wrapped Gaussian, f(4>) = Sfc— oo ff( < /' + 
i) with g{<f>,n,a) = (2tt)- - 5 exp(-0.5 (0 - n) 2 /a 2 ). In the first 
panel, \x = 0.5, while in the second panel, fii = 0.25, and 112 = 0.70. 
In this panel, the ratio of the peak heights is 3/2, and the peak 
widths are identical. 
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Figure 6. As Figure [4] but for a double-peaked Gaussian light 
curve. 

repeat the analysis of the previous section using a tem- 
plate comprising two Gaussian peaks separated by 0.45 
in phase. Real pulsar peaks often have unequal intensi- 
ties (with an energy- dependent ratio.) We reflect that 
here with a slightly-dominant leading peak; see Figure [SJ 
As seen in Figure HI the overall flux thresholds are un- 
surprisingly increased: at a fixed flux, spreading photons 
between multiple peaks decreases the overall S/N. The 
weighted statistics maintain a comfortably decreased flux 
threshold with respect to the unweighted methods. 

4.3.2. Pulsars with DC Emission Components 

Some pulsars emit an appr eciable flux of imp ulsed 7 
rays, e.g. PSR J1836+5925 (|Abdo et all l2010bh . This 
steady emission is a confounding factor for the weighted 
tests, since unpulsed source photons receive the same 
high probability weights as pulsed photons but are dis- 
tributed uniformly in phase, decreasing the effective S/N. 



Figure 7. The 68% flux detection threshold based on a 4<j detec- 
tion criterion for a single-peaked Gaussian light curve as a function 
of duty cycle. Here, the pulsed fraction has been decreased to 50%, 
i.e., half of the photons from the source are uniformly distributed 
and half are drawn from the Gaussian light curve. The flux thresh- 
old for the weighted statistic is 1.5 — 2.0 times lower than the un- 
weighted statistics, with some slight dependence on duty cycle and 
statistic. 

However, for modest ratios of pulsed to unpulsed flux, the 
performance of the weighted test is not unduly dimin- 
ished. In Figure [3 we have examined the flux threshold 
for a single-peaked light curve with equal contributions 
of pulsed and unpulsed emission. As expected, we see 
an overall increased flux threshold of about 2 over the 
fully-pulsed source (c.f. Figure @|. Despite a slight in- 
crease in the ratio of weighted-to-unweighted thresholds, 
the weighted statistics still offer appreciably improved 
sensitivity. 

4.3.3. Effect of Uncertainties in Spectral Parameters 

From these demonstrations, it is clear that the 
probability-weighted statistics have, on average, a factor 
of 1.5 to 2 improved sensitivity relative to the unweighted 
versions. One potential objection is that we have used 
the known spectra in determining the photon weights, 
whereas with real data, of course, we must first estimate 
the spectra of source and background. To assess the im- 
pact of using estimated parameters with concomitant un- 
certainty, we calculate the weights based on parameters 
estimated via max imum likeli hood spectral analysis with 
the pointlike tool (jKerrll20liT ). 

For this test, we simulated 20 realizations of the Vela- 
like point source. From Figure HI we see that the detec- 
tion threshold is about sa 8 x 10~ 9 ph cm' 2 s -1 , and we 
select this for the ensemble flux. This flux yields about 
100 source photons for the 1-year integration period, al- 
lowing for asymptotic calibration. Since the background 
strongly affects the spectral analysis, independent real- 
izations of the diffuse background are important and ac- 
cordingly we also simulated 20 realizations of the diffuse 
sources. To the point source data we added phase from a 
single-Gaussian light curve with a = 0.03, while we gen- 
erated uniform random phases for the diffuse photons. 

To assess the impact of uncertain parameters, we first 
calculate the probability weights using the known model 
parameters for the pulsar and the diffuse background, 
i.e., the "ideal" case. We then perform a maximum like- 
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Figure 8. The significance for each member of the ensemble de- 
scribed in the text calculated using weights derived from the model 
used to simulated the data (the Monte Carlo "truth" ) and derived 
from the ML model. The ML-derived values are very similar to 
those obtained using the known spectrum. 

lihood spectral fit to estimate the spectral parameters. 
Since the simulated point source is very dim relative to 
the background, it is impossible to fit all three param- 
eters. We therefore fix the cutoff energy to fOO GeV, 
effectively a power law spectrum. This approach is con- 
servative since we are now using an incorrect spectral 
model. Using the best-fit values for the flux density and 
the photon index, we calculate a new set of probability 
weights. Finally, we compute Hi<a w to determine the sig- 
nificance (a) using the "ideal" weights and (b) using the 
"measured" weights. 

We compare the results for the two cases — in a units — 
in Figure HI In general, the detection significance ob- 
tained with the "measured" weights is slightly lower than 
that obtained with the "ideal" weights, although in a 
few cases statistical fluctuations lead to the opposite 
outcome. Comparing the population means, the over- 
all significance using measured weights is decreased to 
92% of the ideal case, corresponding to an increase in 
flux threshold of ^10%. This effect is small compared 
to the factor of 1.5 — 2.0 increase in flux threshold seen 
between the weighted and unweighted versions of the H- 
test. We therefore conclude that — even accounting for 
uncertainties in the spectral parameters used to calcu- 
late the probability weights — weighted statistics offer a 
significant improvement in sensitivity. 

4.3.4. Comparison of Pulsed and Unpulsed Detection 
Thresholds 

The machinery established above — performing spectral 
fits on an ensemble to compute the weighted statistics us- 
ing probabilities estimated from an ML fit — also provides 
for directly comparing the DC (unpulsed) source signifi- 
cance with the pulsed significan ce. To est i mate the DC 
significance, we use the result of Chcrnof| (|1954J ) for the 
asymptotic calibration for the likelihood ratio for a sin- 
gle parameter whose null value lies on a boundary. To 
apply this calibration, we take the free parameter to be 
the flux density (which has a boundary at 0) and fix the 
photon index T = 2.0. The cutoff remains fixed at 100 
GeV. With this convention, the DC significance is then 
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Figure 9. A comparison of the significance of pulsed detection 
versus unpulsed detection. The pulsed detection significance was 
calculated using the ML best-fit spectrum with the weighted li- 
test, while the unpulsed significance was derived from a likelihood 
ratio test as described in the text. For nearly all sources, the 
pulsations are more strongly detected than the DC emission. 

given by 

o-dc = ^2 x log£ opt /£ , (11) 

with £ op t the likelihood value obtained with the best-fit 
flux density and £0 the likelihood value obtained with 
the flux density set to zero. It should be noted that 
a dc is a one-sided significance, so the same value of o~dc 
corresponds to a slightly higher value the "cr" for the 
pulsed detection. This discrepancy is small compared to 
the observed difference in significance. 

As in the previous section, we calculate the probability 
weights using both the Monte Carlo truth values of the 
parameter and with the best-fit spectrum — in this case, 
with cutoff energy and photon index fixed — to estimate 
a pulsed significance with the i?20t« test. 

We compare the unpulsed and pulsed significances 
in Figure [9l The majority of ensemble members are 
detected more significantly through pulsations than 
through unpulsed emission. The measured significance 
for pulsed detections is a factor of 2.2 greater than for DC 
detection. If we assume that the detection flux threshold 
is linear in aoc as it is in the pulsed significance, this 
means we require sources to be about twice as bright 
on average to detect them through DC emission rather 
than pulsed emission, though the factor for a particular 
pulsar depends strongly on the light curve morphology, 
especially the sharpness of the peak(s). 

This result also suggests a computationally efficient 
method for calculating pulsed flux upper limits. The av- 
erage "efficiency" factor (2.2 above) can be determined 
for some family of light curve templates, and pulsed flux 
upper limits can be determined directly from DC upper 
limits, which may themselves be straightforwardly calcu- 
lated from the data with maximum likelihood methods. 

Finally, we note that H20W calculated with weights 
obtained from a spectral fit using only one degree of 
freedom — the flux density — is comparable to that ob- 
tained when calculated with both the flux density and 
the photon index allowed to vary. This suggests an insen- 
sitivity to the precise spectral shape assumed for calcula- 
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tion of the weights. This result has an important practi- 
cal implication since the uncertainties on po wer law pa- 
ramet ers can be quite large for dim sources ([Abdo et al.l 
l2010aj) . in which case assuming a fixed, canonical spec- 
tral shape is likely to yield better results than fitting both 
flux density and power law slope. 

5. DISCUSSION 

We have demonstrated that incorporating probability 
weights into existing pulsation tests appreciably increases 
their sensitivity and robustness. This increase has non- 
trivial implications for the detected pulsar population. 
E.g., the population of pulsars detected so far is ob- 
serve d to have a slope of about —1 on a Log N-Log S 
plot (|Abdo et al.N2010d) . meaning that the increase in 
sensitivity translates directly to an increase in the de- 
tected population. That is, the adoption of weighted 
statistics can increase the number of pulsars detected by 
Fermi-LAT by 50-100%. The technique will work even 
better relative to unweighted tests in the central regions 
of the Galaxy, within 1° of the Galactic plane and within 
90° longitude of the Galactic center, where the projected 
source density is high, leading to ample discovery oppor- 
tunity but significant source confusion. These back-of- 
the-envelope arguments suggest Fermi-LAT may be able 
to detect, by the end of its mission, the full population 
of radio-timed pulsars whose 7-ray beams intersect the 
earth. 

Increasing the population size is important for at- 
tempts to understand the radiation mechanism of pul- 
sars. The light curves predicted by models of magne- 
tosphcric emission depend strongly on the configuration 
of the magnetic field (represented by the angle between 
the magnetic dipole and the neutron star spin axis) and 
the viewing angle, the angle between the spin axis and 
the line of sight. Although these quantities can some- 
times b e estimated by, e.g., ra dio polarization measure- 
ments (jWeltevrede et al.l l2010h or X -ray observation o f 
symmetrical pulsar wind nebulae (|Ng fc Romanil [20041) . 
they are more often nuisance parameters. By analyzing 
a large population simultaneously, one samples many re- 
alizations of the geometry and effectively marginalizes 
these nuisance parameters, allowing robust inference of 
the model parameters. Although pulsars detected by in- 
creasing the sensitivity will be certainly be faint, even 
the identification of crude features such as the number of 
peaks and their separation can si gnificantly constrain the 
allowed geometry for the pulsar (jWatters et al.l l2009). 

Besides improving prospects for detecting the known 
pulsar population, the machinery developed here to as- 
sess statistical performance can be useful for population 
synthesis, i.e., placing the strongest constraints possible 
on the 7-ray flux from radio-loud pulsars. In the material 
above, we saw that weighted statistics outperform both 
unweighted versions of the same statistics and unpulsed 
significance tests. Thus, pulsed sensitivity may be the 
best tool for constraining the 7-ray emission. Although 
much work has been done o n extracting analytic pulsed 
upper limits (jde Jagerll994 e.g.), the weighted method is 
not particularly amenable to this approach. Each source 
comes with its own set of probability weights that de- 
pends strongly on the pulsar spectrum and its position 
on the sky. With this complication, it makes sense to in- 
stead explore the sensitivity as a function of light curve 



shape (and perhaps spectrum) using an ensemble of sim- 
ulated sources as we have done here. Although we have 
only considered a small number of positions, this exercise 
can in principle be scaled to a sufficiently fine tessellation 
of the sky to provide a pulsed sensitivity map for the full 
sky. Finally, this method can be specialized to provide 
pulsed flux upper limits for particular known pulsars by 
combining simulated data for a candidate pulsar with the 
data observed by the Fermi-LAT. 

6. CONCLUSION 

We have shown that existing tests for pulsation gain 
greatly in sensitivity through incorporation of probabil- 
ity weights. The new versions retain their asymptotic 
calibration, are insensitive to how data are selected, and 
allow detection of pulsars fainter by a factor of 1.5 to 2. 
Taken together, these results represent a significant gain 
in the search for pulsed 7 rays from pulsars and sug- 
gest the adoption of weighted statistics — in particular, 
H mw — for omnibus pulsation tests. 
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APPENDIX 



THE ASYMPTOTIC NULL DISTRIBUTION OF H M 
Recall the H m statistic is defined as 

max[Zf - c(i - 1)] = max[I,], 1 < i < m. (Al) 



H, 



In its original formulation, m = 20 and c = 4 > sup- 
presses contributions from the higher harmonics in the 
null case. Let X$ = Z^ — c(i — 1). (For convenience, 
Xo = c.) Then H m is an extreme order statistic of 
the Xi, a collection of m dependent, non-identically dis- 
tributed rvs. 



The Joint Probability Density Function of X 

Assuming Z^ ~ x\ r m can be obtained from Xi 

by adding a x\ distributed variable and subtracting c: 



(A2) 



Let X m be a random vector in K m , such that H m is 
the maximum element of X m . We construct the joint pdf 
for A as a product of conditional distributions: 

f X m ( x rn) ~ fx m \X m -t \ x m\Xm— l) 

X ^X m _i|X m _ 2 \ x m—\\%m—2) x ' ' ' 

x Jx 2 \x 1 {x2\xi) x f Xx (xi) 



Y[xl(xi 



(A3) 
(A4) 



demonstrating the Markov property. 

Inserting the explicit form for x\( x ) — \ exp(— |) 0(x), 
where 6{x) is the Heaviside step function restricting sup- 
port to positive arguments, yields a significant simplifi- 
cation: 



+ c) 



exp 



where for convenience ct=\ exp (—•§)• 



(-1). 

(A5) 



The Cumulative Distribution Function of H m 

H m is just the maximum element of the vector X m . 
Thus, the probability to observe a value less than or equal 
to h m is simply the integral of the f%{x) over all values 
of x with all elements of x less than or equal to h m : 




fxL (Xm)- 



(A6) 



With the form obtained in Eq. IA5l the rhs becomes 



a" 



n 



dxi6 (xi — Xi-x + c) 



exp 



(-?) 



! A7 > 

While the integrand is simply an exponential in a single 
variable, the main difficulty in evaluating the expression 
lies in determining the support of the integrand. 

Reduction of FH m (h m ) 

We can develop the integral in Eq. IA7I recursively. 
First, we make a change of variables in the rightmost 
integral: u m = x m — x m —\ + c. This integral is then 
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The first term of the lhs is the cumulative distribution 
function for an m — 1 harmonic H-test, Fn m _ 1 , while the 
second term is a volume: 



F Hm (K 
where 



a 



x exp 



I„ 



(A9) 



In(h) = n 



dxi 9 (xi — Xi-i + c) 



Fully reducing Fu m yields a power series in a: 

m— 1 

F Hm (h m ) = 1 - exp i 



X ^ a ' lI n(hr. 
n=0 



(A10) 



Evaluation of I n 
We begin with a change of variables to eliminate the 
step functions. Let Ui = Xi — Y^i=i(xj ~ c )- Then 



In(h)=\[ 



i=l 



du, 



(All) 



Here, B. L = h + (i — l)c — Xj, and we note that 

B n = B n -i + c — u n —i. We can evaluate the n integrals 
recursively. With each integration, we make the change 
of integration variable to qi = P>i + (n — i + \ )c — Uj. For 
instance, evaluating the rightmost integral, we have 
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Figure 10. The survival function (1 — F(H)) of the asymptotic 
distribution for H for a sample (n = 10 9 ) drawn from the null 
distribution by simulation. To reduce the scale, we divide by the 
survival function for a \ 2 variable with two degrees of freedom, 
exp(— 0.5a;). For each maximum harmonic, the sample distribu- 
tions agree with the asymptotic calibration. 

= Y\_ [J duij J du n -iB n --i +c-u„-i 




This form is typical as one continues to integrate. The 
change of variable always produces a monomial in the 
integration variable. The upper boundary then produces 
a term in the integration variable of the next integral 
while the lower boundary produces a monomial of c. This 
separation allows a recursive development for /„, and 
combining the recursive terms yields 

,.<„ = M. (A13) 

J=l J 
We note that Io(h) = 1 and I\(h) = h. 

Monte Carlo Validation 

We validated our results with Monte Carlo simulations 
of the H20 statistic in the asymptotic null case. Specifi- 
cally, for each realization of H 2 o, we drew 20 realizations 
from a x 2 distribution with one degree of freedom and 
determined H accordingly (with c — 4.) In Figure [TOl we 
show the results of 10 9 Monte Carlo trials for a variety of 
maximum harmonics. The results are in good agreement 
with the asymptotic distribution derived here. 



General Properties for to > 10 

Although the asymptotic calibration fo r H^n has been 
previously characterized by Mont e Carlo (|de Jager et al.l 
Il989t Ide Jager fc Buschindl20100 . the analytic solution 
extends the calibration to an arbitrary collection of har- 
monics. (The method used here to develop the asymp- 
totic distribution can be easily extended to cases where 
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Figure 11. The survival function for the .ff-test for a vari- 
ety of maximum harmonics. The values have been scaled by 
exp(— 0.398405 H m ), which is seen to provide an excellent approx- 
imation to the survival function for large m. 

the harmonics are not sequential, e.g. a test incorporat- 
ing only the 2nd, 10th, and 20th harmonics. However, 
for simplicity's sake we retain the original formulation of 
an inclusive set of harmonics.) 

It is apparent from Figure[TU]that the distributions ap- 
proach a limiting distribution as m, the maximum har- 
monic, becomes large. Indeed, as we see in Figure 1 1 1 1 
for to > 10, there is very little difference in the tail prob- 
ability for "practical" values of H. For instance, the 
chance probabilities of observing H m > 50 for any har- 
monic to > 10 are all within a factor of 2 of each other, 
a negligible distinction at this significance level, and for 
H m < 50, the discrepancy is even smaller. We conclude 
that, once one has made the decision to choose to large 
enough to allow for sharply-peaked light curves — and in- 
deed, this is the whole point of the .ff-test — that there 
is no penalty for making to as large as feasible. In this 
sense, the H-test becomes truly omnibus, as to can be 
chosen large enough to make the test sensitive to light 
curves with arbitrarily sharp features. Finally, we note 
a practical formula for the cdf applicable for to > 10: 
1 - F{H) exp(-0.398405iJ). T his is, naturally, quite 
close to the formula reported by Ide Jager fc BuschinsJ 
(2010). For to < 10, quadratic terms in the exponen- 
tial are important for an accurate evaluation of the tail 
probability and the full expression Eq. IA10I should be 
used. 



